ABSTRACT

We consider the indifference-zone (IZ) formulation of the ranking and selection problem in which the goal 
is to choose an alternative with the largest mean with guaranteed probability, as long as the difference 
between this mean and the second largest exceeds a threshold. Conservatism leads classical IZ procedures 
to take too many samples in problems with many alternatives. The Bayes-inspired Indifference Zone 
(BIZ) procedure, proposed in Frazier (2014), is less conservative than previous procedures, but its proof 
of validity requires strong assumptions, specifically that samples are normal, and variances are known 
with an integer multiple structure. In this paper, we show asymptotic validity of a slight modification 
of the original BIZ procedure as the difference between the best alternative and the second best goes to 
zero, when the variances are known and finite, and samples are independent and identically distributed, 
but not necessarily normal. 


1 INTRODUCTION 


There are many applications where we have to choose the best alternative among a finite number of 
simulated alternatives. For example, in inventory problems, we may want to choose the best inventory 
policy (s, S) for a finite number of values of s and S. This is called the ranking and selection problem. A 
good procedure for addressing this problem should be both efficient and accurate, i.e. it should balance 
the number of samples it takes with the quality of its selection. 

This paper considers the indifference-zone (IZ) formulation of the ranking and selection problem, in
which we require that a procedure satisfy the IZ guarantee, i.e., that the best system be chosen with
probability larger than some threshold $P^*$ given by the user, when the distance between the best system
and the others is larger than some other user-specified threshold $\delta > 0$. The set of problem
configurations satisfying this constraint on the difference in means is called the preference zone. The
paper Bechhofer (1954) is considered the seminal work, and early work is presented in the monograph
Bechhofer, Kiefer, and Sobel (1968). Some compilations of the theory developed in the area can be found
in R. E. Bechhofer (1995), Swisher, Jacobson, and Yücesan (2003), Kim and Nelson (2006) and Kim and
Nelson (2007). Other approaches, beyond the indifference-zone approach, include the Bayesian approach
(Frazier 2012), the optimal computing budget allocation approach (Chen and Lee 2010), the large
deviations approach (Glynn and Juneja 2004), and the probability of good selection guarantee (Nelson
and Banerjee 2001). The last approach is similar to the indifference-zone formulation, but provides a
more robust guarantee.
A good IZ procedure satisfies the IZ guarantee and requires as few samples as possible. The first
IZ procedures, presented in Bechhofer (1954), Paulson (1964), Fabian (1974), Rinott (1978), Hartmann
(1988), Hartmann (1991), and Paulson (1994), satisfy the IZ guarantee, but they usually take too many
samples when there are many alternatives, in part because they are conservative: their probability of
correct selection (PCS) is much larger than the probability specified by the user (Wang and Kim 2013).
One reason for this is that these procedures use Bonferroni’s inequality, which leads them to sample
more than necessary. The Bonferroni-based bounds underlying these procedures become looser, and
the tendency to take more samples than necessary increases, as the number of alternatives grows. More
recently, new algorithms were developed in Kim and Nelson (2001), Goldsman et al. (2002), and Hong
(2006); they improve performance but still use Bonferroni’s inequality, and so the methods
are inefficient when there are many alternatives. The procedures in Kim and Dieker (2011) and Dieker
and Kim (2012) do not use Bonferroni’s inequality when there are only three alternatives, but again use
Bonferroni’s inequality when comparing more than three alternatives.

In addition to Bonferroni’s inequality, two other common sources of conservatism in indifference-zone
ranking and selection procedures are the change from discrete time to continuous time often used to
show IZ guarantees, and the fact that typically, the configuration under consideration is not a
worst-case configuration (Wang and Kim 2013). The difference between worst and typical cases tends to
contribute the most to conservatism, with Bonferroni’s inequality contributing second-most, and the
continuous/discrete time difference contributing the least (Wang and Kim 2013). Although the
difference between the worst and typical cases is the largest contributor to conservatism, all
indifference-zone procedures must meet the PCS guarantee for all configurations in the preference zone,
including worst-case configurations, and so this source of conservatism is fundamental to the
indifference-zone formulation. Thus, eliminating the use of Bonferroni’s inequality remains an important
route for reducing conservatism while still retaining the indifference-zone guarantee.


Frazier (2014) presents a new sequential elimination IZ procedure, called BIZ (Bayes-inspired
Indifference Zone), that eliminates the use of Bonferroni’s inequality, reducing conservatism. This
procedure’s lower bound on the worst-case probability of correct selection in the preference zone is
tight in continuous time, and almost tight in discrete time. In numerical experiments, the number of
samples required by BIZ is significantly smaller than that of procedures like the procedure of
Bechhofer, Kiefer, and Sobel (1968) and the KN procedure of Kim and Nelson (2001), especially on
problems with many alternatives. Unfortunately, the proof from Frazier (2014) that the BIZ procedure
satisfies the IZ guarantee for the discrete-time case assumes that (1) samples are normally distributed;
(2) variances are known; and (3) the variances are either common across alternatives, or have an
unrealistic integer multiple structure.

The contribution of this work is to prove the asymptotic validity of the BIZ procedure as $\delta$ goes
to zero, retaining the assumption of known variances, but replacing assumptions (1) and (3) by the
much weaker assumption of independent and identically distributed finite variance samples. Thus, our
proof allows a much broader set of sampling distributions than that allowed by Frazier (2014), including
non-normal samples and general heterogeneous variances. We also show that this bound on worst-case
PCS is asymptotically tight as $\delta$ goes to zero, showing that the BIZ procedure successfully
eliminates conservatism due to Bonferroni’s inequality in this more general setting, just as was
demonstrated by Frazier (2014) for more restricted settings.


To simplify our analysis, we analyze a slight modification of the version of the BIZ procedure
presented in Frazier (2014), which keeps a certain parameter fixed rather than letting it vary as did
Frazier (2014). Numerical experiments on typical cases show little difference in performance between
the version of BIZ we analyze and the version in Frazier (2014). We conjecture that a proof technique
similar to the one presented here can be used to show asymptotic validity of the BIZ procedure when
the variances are unknown, and we present numerical experiments that support this belief.

This paper is organized as follows: In Section 2, we recall the indifference-zone ranking and selection
problem. In Section 3, we recall the Bayes-inspired IZ (BIZ) procedure from Frazier (2014). In Section
4, we present the proof of the validity of the algorithm when the variances are known. In Section 5, we
present some numerical experiments. In Section 6, we conclude.


2 INDIFFERENCE-ZONE RANKING AND SELECTION 

Ranking and selection is a problem where we have to select the best system among a finite set of
alternatives, i.e., the system with the largest mean. The method selects a system as the best based
on the samples that are observed sequentially over time. We suppose that samples are identically
distributed and independent, over time and across alternatives, and each alternative $x$ has mean
$\mu_x$. We define $\mu = (\mu_1, \ldots, \mu_k)$.

If the best system is selected, we say that the procedure has made the correct selection (CS). We
define the probability of correct selection as

$\mathrm{PCS}(\mu) = \mathbb{P}_\mu\left(\hat{x} \in \arg\max_x \mu_x\right)$

where $\hat{x}$ is the alternative chosen by the procedure and $\mathbb{P}_\mu$ is the probability
measure under which samples from system $x$ have mean $\mu_x$ and finite variance $\lambda_x^2$.

In indifference-zone ranking and selection, the procedure is indifferent in the selection of a
system whenever the means of the populations are nearly the same. Formally, let
$\mu = [\mu_1, \ldots, \mu_k]$ be the vector of the true means; the indifference zone is defined as the
set $\{\mu \in \mathbb{R}^k : \mu_{[k]} - \mu_{[k-1]} < \delta\}$. The complement of the indifference
zone is called the preference zone (PZ) and $\delta > 0$ is called the indifference-zone parameter. We
say that a procedure meets the indifference-zone (IZ) guarantee at $P^* \in (1/k, 1)$ and $\delta > 0$ if

$\mathrm{PCS}(\mu) \ge P^*$ for all $\mu \in \mathrm{PZ}(\delta)$.

We assume $P^* > 1/k$ because an IZ guarantee with $P^* \le 1/k$ can be met by choosing $\hat{x}$
uniformly at random from $\{1, \ldots, k\}$.
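
As a small illustration of these definitions, the following Python sketch checks whether a vector of means lies in the preference zone and estimates the PCS of uniformly random selection, which is about 1/k. The function names are hypothetical and are not part of any procedure discussed in this paper.

```python
import random


def in_preference_zone(mu, delta):
    """True if the largest mean exceeds the second largest by at least delta."""
    ordered = sorted(mu)
    return ordered[-1] - ordered[-2] >= delta


def uniform_selection_pcs(k, replications=100_000, seed=0):
    """Monte Carlo estimate of the PCS when an alternative is picked at random."""
    rng = random.Random(seed)
    best = k - 1  # by convention here, the last alternative is the best one
    hits = sum(rng.randrange(k) == best for _ in range(replications))
    return hits / replications


slippage = [0.0, 0.0, 0.0, 1.0]           # best mean is exactly delta above the rest
inside = in_preference_zone(slippage, delta=1.0)
random_pcs = uniform_selection_pcs(k=len(slippage))
```

Random selection therefore only delivers a PCS near 1/k, which is why the guarantee is interesting only for thresholds above that value.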


3 THE BAYES-INSPIRED IZ (BIZ) PROCEDURE 


BIZ is an elimination procedure. This procedure maintains a set of alternatives that are candidates
for the best system, and it takes samples from each alternative in this set at each point in time. At
the beginning, all alternatives are possible candidates for the best system, and over time alternatives
are eliminated. The procedure ends when there is only one alternative in the contention set and this
remaining alternative is chosen as the best. It is shown in Frazier (2014) that the algorithm ends in a
finite number of steps with probability one.

Frazier (2014) shows that the BIZ procedure satisfies the IZ guarantee under the assumptions that
samples are normally distributed, variances are known, and the variances are either common across
alternatives, or have an integer multiple structure. The continuous-time version of this procedure also
satisfies the IZ guarantee, with a tight worst-case preference-zone PCS bound.

A slight modification of the discrete-time BIZ procedure for unknown and/or heterogeneous sampling
variances is given below. This algorithm takes a variable number of samples from alternative $x$ at
time $t$, and $n_{tx}$ is this number (its definition may be found in the algorithm given below). This
algorithm depends on a collection of integers $B_1, \ldots, B_k$, and on $P^*$, $c$, $\delta$ and $n_0$.
Here, $n_0$ is the number of samples to use in the first stage of samples, and 100 is the recommended
value for $n_0$ when the variances are unknown. The parameter $B_x$ controls the number of samples taken
from system $x$ in each stage. To simplify our analysis, the procedure presented is a slight
modification of the original BIZ procedure (Frazier 2014) where $z \in \arg\max_{x \in A} \lambda_x^2$,
instead of $z \in \arg\min_{x \in A} n_{tx}/\hat{\lambda}_{tx}^2$. According to numerical experiments
on common cases, there is little difference in the PCS between the version of BIZ we analyze and the
version in Frazier (2014).

For each $t$, $x \in \{1, \ldots, k\}$, and subset $A \subset \{1, \ldots, k\}$, we define a function

$q_{tx}(A) = \exp\left(\delta \beta_t \frac{Z_{tx}}{n_{tx}}\right) \Big/ \sum_{x' \in A} \exp\left(\delta \beta_t \frac{Z_{tx'}}{n_{tx'}}\right), \qquad \beta_t = \frac{\sum_{x' \in A} n_{tx'}}{\sum_{x' \in A} \hat{\lambda}_{tx'}^2},$

where $\hat{\lambda}_{tx}^2$ is the sample variance of all samples from alternative $x$ thus far, and
$Z_{tx} = Y_{n_{tx}, x}$ is the sum of the samples from alternative $x$ observed by stage $t$.
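
The function defined here is a softmax over the alternatives still in contention. A minimal Python sketch follows, under the assumption that $\beta_t$ is the total sample count over $A$ divided by the total variance over $A$ (the form used in Frazier (2014)); the name `biz_q` and its dictionary-based interface are hypothetical and chosen only for illustration.

```python
import math


def biz_q(Z, n, var, A, delta):
    """Return {x: q_tx(A)} for the alternatives x in A.

    Z[x]   : sum of the samples from alternative x so far
    n[x]   : number of samples from alternative x so far (must be > 0)
    var[x] : known variance or current sample variance of alternative x
    """
    # Assumed form of beta_t: total sample count over total variance in A.
    beta = sum(n[x] for x in A) / sum(var[x] for x in A)
    scores = {x: delta * beta * Z[x] / n[x] for x in A}
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(scores.values())
    weights = {x: math.exp(s - top) for x, s in scores.items()}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


q = biz_q(Z={0: 1.0, 1: 2.0, 2: 6.0}, n={0: 4, 1: 4, 2: 4},
          var={0: 1.0, 1: 1.0, 2: 1.0}, A={0, 1, 2}, delta=1.0)
```

The values always sum to one over $A$, so removing an alternative from $A$ redistributes its mass among the survivors.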


Algorithm: Discrete-time implementation of BIZ, for unknown and/or heterogeneous variances.

Require: $c \in [0, 1 - (P^*)^{1/(k-1)}]$, $\delta > 0$, $P^* \in (1/k, 1)$, $n_0 \ge 0$ an integer,
$B_1, \ldots, B_k$ strictly positive integers. Recommended choices are $c = 1 - (P^*)^{1/(k-1)}$,
$B_1 = \cdots = B_k = 1$ and $n_0$ between 10 and 30. If the sampling variances $\lambda_x^2$ are known,
replace the estimators $\hat{\lambda}_{tx}^2$ with the true values $\lambda_x^2$, and set $n_0 = 0$.

1: For each $x$, sample alternative $x$ $n_0$ times and set $n_{0x} \leftarrow n_0$. Let $W_{0x}$ and
   $\hat{\lambda}_{0x}^2$ be the sample mean and sample variance respectively of these samples. Let
   $t \leftarrow 0$. Let $z \in \arg\max_{x \in A} \hat{\lambda}_{0x}^2$, where $\hat{\lambda}_{0x}^2$ is
   the empirical estimator of the variance $\lambda_x^2$ using $n_0$ samples, for $x \in A$.
2: Let $A \leftarrow \{1, \ldots, k\}$, $P \leftarrow P^*$.
3: while $\max_{x \in A} q_{tx}(A) < P$ do
4:   while $\min_{x \in A} q_{tx}(A) < c$ do
5:     Let $x \in \arg\min_{x \in A} q_{tx}(A)$.
6:     Let $P \leftarrow P/(1 - q_{tx}(A))$.
7:     Remove $x$ from $A$.
8:   end while
9:   For each $x \in A$, let $n_{t+1,x} = \mathrm{ceil}\left(\hat{\lambda}_{tx}^2 (n_{tz} + B_z)/\hat{\lambda}_{tz}^2\right)$.
10:  For each $x \in A$, if $n_{t+1,x} > n_{tx}$, take $n_{t+1,x} - n_{tx}$ additional samples from
     alternative $x$. Let $W_{t+1,x}$ and $\hat{\lambda}_{t+1,x}^2$ be the sample mean and sample
     variance respectively of all samples from alternative $x$ thus far.
11:  Increment $t$.
12: end while
13: Select $\hat{x} \in \arg\max_{x \in A} Z_{tx}/n_{tx}$ as our estimate of the best.
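
The steps above can be sketched in Python for the known-variance case with $n_0 = 0$. This is an illustrative sketch rather than a reference implementation, and it rests on several assumptions that are not spelled out in the algorithm: $\beta_t$ is the total sample count over $A$ divided by the total variance over $A$; $q$ is taken to be uniform on $A$ before any samples exist; a single stage size `B` plays the role of $B_z$; and the reference count $n_{tz}$ is treated as the deterministic value $tB_z$ even if $z$ has been eliminated. The function name and the `sample` callback are hypothetical.

```python
import math
import random


def biz_known_variance(sample, variances, delta, p_star, B=1, max_stages=100_000):
    """Sketch of BIZ with known variances (n_0 = 0); returns the selected index.

    sample(x) must return one new observation from alternative x.
    """
    k = len(variances)
    c = 1.0 - p_star ** (1.0 / (k - 1))             # maximum-elimination choice of c
    z = max(range(k), key=lambda x: variances[x])   # fixed reference alternative
    A = set(range(k))
    P = p_star
    n = [0] * k        # number of samples taken from each alternative
    Z = [0.0] * k      # running sum of the samples from each alternative

    def q_values():
        if any(n[x] == 0 for x in A):
            # Assumption: before any samples are taken, q is uniform on A.
            return {x: 1.0 / len(A) for x in A}
        beta = sum(n[x] for x in A) / sum(variances[x] for x in A)
        score = {x: delta * beta * Z[x] / n[x] for x in A}
        top = max(score.values())
        w = {x: math.exp(s - top) for x, s in score.items()}
        total = sum(w.values())
        return {x: v / total for x, v in w.items()}

    for t in range(max_stages):
        q = q_values()
        if max(q.values()) >= P:
            break
        # Eliminate alternatives whose q falls below c, raising the threshold P.
        while min(q.values()) < c:
            worst = min(q, key=q.get)
            P = P / (1.0 - q[worst])
            A.remove(worst)
            q = q_values()
        # Sample counts for stage t + 1, scaled by the variance ratio to z.
        for x in A:
            target = math.ceil(variances[x] / variances[z] * (t + 1) * B - 1e-9)
            for _ in range(target - n[x]):
                Z[x] += sample(x)
            n[x] = max(n[x], target)
    else:
        raise RuntimeError("no selection within max_stages")
    return max(A, key=lambda x: Z[x] / n[x])


rng = random.Random(0)
means = [0.0, 0.0, 1.0]
variances = [1.0, 1.0, 1.0]
choice = biz_known_variance(lambda x: rng.gauss(means[x], 1.0),
                            variances, delta=1.0, p_star=0.9)
```

Repeating the call many times and counting how often the best alternative is returned gives a Monte Carlo estimate of the PCS for a given configuration.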


This algorithm generalizes the BIZ procedure with known common variance. In that case, we have
that $B_1 = \cdots = B_k = 1$ and $n_{tx} = t$. The algorithm can be generalized to the continuous case
(see Frazier (2014)).
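
To see how the general expression specializes in the known common variance case, the following short derivation substitutes $n_{tx} = t$ and a common variance $\lambda^2$. It is a sketch that assumes $\beta_t$ is the ratio of the total sample count over $A$ to the total variance over $A$, as in Frazier (2014).

```latex
\beta_t = \frac{\sum_{x' \in A} n_{tx'}}{\sum_{x' \in A} \lambda^2}
        = \frac{|A|\, t}{|A|\, \lambda^2} = \frac{t}{\lambda^2},
\qquad
q_{tx}(A) = \frac{\exp\left(\delta \frac{t}{\lambda^2} \frac{Z_{tx}}{t}\right)}
                 {\sum_{x' \in A} \exp\left(\delta \frac{t}{\lambda^2} \frac{Z_{tx'}}{t}\right)}
          = \frac{\exp\left(\delta Z_{tx} / \lambda^2\right)}
                 {\sum_{x' \in A} \exp\left(\delta Z_{tx'} / \lambda^2\right)}.
```

Under this assumption, $q_{tx}(A)$ depends on the data only through the running sums $Z_{tx}$, scaled by $\delta/\lambda^2$.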


4 ASYMPTOTIC VALIDITY WHEN THE VARIANCES ARE KNOWN

In this section we prove that the BIZ procedure asymptotically satisfies the IZ guarantee when the
variances are known. This means that we consider a collection of ranking and selection problems
parametrized by $\delta > 0$. For the problem given $\delta$, we suppose that the vector of the true
means $\mu = [\mu_1, \ldots, \mu_k]$ is equal to $\delta a$ for some fixed $a \in \mathbb{R}^k$ that does
not depend on $\delta$ and satisfies $a_k > a_{k-1} \ge \cdots \ge a_1$, $a_k - a_{k-1} \ge 1$. Moreover,
the variances of the alternatives are finite, strictly greater than zero and do not depend on $\delta$.
We also suppose that samples from system $x \in \{1, \ldots, k\}$ are identically distributed
and independent, over time and across alternatives. We also define
$\lambda_z^2 := \max_{i \in \{1, \ldots, k\}} \lambda_i^2$.

Any ranking and selection algorithm can be viewed as a mapping from paths of the $k$-dimensional
discrete-time random walk $(Y_{tx} : t \in \mathbb{N}, x \in \{1, \ldots, k\})$ onto selection
decisions. Our proof uses this viewpoint, noting that the BIZ procedure’s mapping from paths onto
selection decisions is the composition of three simpler maps.

The first mapping is from the raw discrete-time random walk
$(Y_{tx} : t \in \mathbb{N}, x \in \{1, \ldots, k\})$ onto a time-changed version of this random walk,
written as $(Z_{tx} : t \in \mathbb{N}, x \in \{1, \ldots, k\})$, where we recall that
$Z_{tx} = Y_{n_{tx}, x}$ is the sum of the samples from alternative $x$ observed by stage $t$.

The second one maps this time-changed random walk through a non-linear mapping for each $t$, $x$
and subset $A \subset \{1, \ldots, k\}$, to obtain
$(q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)$, where

$q'_{tx}(A) = \exp\left(\delta \beta_t \frac{Z_{tx}}{n_{tx}}\right) \Big/ \sum_{x' \in A} \exp\left(\delta \beta_t \frac{Z_{tx'}}{n_{tx'}}\right) := q'\left((Z_{tx} : x \in A), \delta, t\right)$

where we note that $n_{tx} = n_x(t)$ and $\beta_t$ are deterministic in the version of the
known-variance BIZ procedure that we consider here.

The third one maps the paths of $(q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)$
onto selection decisions. Specifically, this mapping begins with $A_0 = \{1, \ldots, k\}$, $P_0 = P^*$,
and finds the first time $\tau_1$ that $q'_{\tau_1 x}(A_0)$ falls above the threshold $P_0$, or below
the threshold $c$, for some $x$. If the first case occurs, the alternative with the largest
$q'_{\tau_1 x}(A_0)$ is selected as the best. If the second case occurs, the alternative with the
smallest $q'_{\tau_1 x}(A_0)$ is eliminated, resulting in a new set $A_1$, a new selection threshold
$P_1$ is calculated from $P_0$ and the eliminated alternative’s value of $q'_{\tau_1 x}(A_0)$, and the
process continues. This process is repeated until an alternative is selected as the best. Call this
mapping $h$, so that the BIZ selection decision is
$h\left((q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)\right)$.


4.1 Proof Outline 

Based on this view of the BIZ procedure as a composition of three maps, we outline the main ideas of 
our proof here. 

Our proof first notes that the same selection decision is obtained if we apply the BIZ selection map
$h$ to a time-changed version of $(q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)$,
specifically to

$(q_{tx}(A) : t \in \delta^2 \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)$,

where $q_{tx}(A) := q'\left((Z_{t/\delta^2, x} : x \in A), \delta, t/\delta^2\right)$.

This discrete-time process is interpolated by the continuous-time process

$(q_{tx}(A) : t \in \mathbb{R}, A \subset \{1, \ldots, k\}, x \in A)$. (1)

If we apply the BIZ selection map $h$ to this continuous-time process, the selection decision will
differ from BIZ’s selection decision for $\delta > 0$, but we show that this difference vanishes as
$\delta \to 0$. Thus, our proof focuses on showing that, as $\delta \to 0$, applying the BIZ selection
map $h$ to (1) produces a selection decision that satisfies the indifference-zone guarantee.

To accomplish this, we use a functional central limit theorem for $Z_{t/\delta^2, x}$, which shows that
a centralized version of $Z_{t/\delta^2, x}$ converges to a Brownian motion as $\delta$ goes to 0. We
denote this centralized version of $Z_{t/\delta^2, x}$ by $C_x(\delta, t)$.
Rewriting $Z_{t/\delta^2, x}$ in terms of $C_x(\delta, t)$ and substituting into the definition of
$q_{tx}(A)$ provides the expression

qtx (v4) = g (^Cx (5, ^ ^ 6ax: X e , 5, . (2) 

We will construct a mapping $f(\cdot, \delta)$ that takes as input the process
$(C_x(\delta, t) : x \in \{1, \ldots, k\}, t \in \mathbb{R})$, calculates (1) from it, applies the BIZ
selection map $h$ to (1), and then returns 1 if the correct selection was made, and 0 otherwise. Thus,
the correct selection event that results from applying the BIZ selection map $h$ to (1) is the result of
applying the mapping $f(\cdot, \delta)$ to the paths $t \mapsto C_x(\delta, t)$.

With these pieces in place, the last part of our proof is to observe that (i) $C(\delta, \cdot)$
converges to a multivariate Brownian motion $W$ as $\delta$ goes to 0; (ii) the function $f$ has a
continuity property that causes

$f(C(\delta, \cdot), \delta) \Rightarrow g(W)$

as $\delta$ goes to 0, where $g$ is the selection decision from applying the BIZ procedure in continuous
time; and (iii) the BIZ procedure satisfies the IZ guarantee when applied in continuous time (Theorem 1
in Frazier (2014)), and so $E[g(W)] \ge P^*$ with equality for the worst configurations in the
preference zone.


4.2 Preliminaries for the Proof of the Main Theorem 


In this section, we present preliminary results and definitions used in the proof of the main theorem:
first, a central limit theorem (Corollary 1); second, definitions of the functions $f(\cdot, \delta)$
and $g(\cdot)$; third, a continuity result (Lemma 2); and fourth, a result (Lemma 3) that allows us to
change from discrete-time processes to continuous-time processes.

First, we are going to see that the centralized sum of the output data $C_x(\delta, t)$ converges to a
Brownian motion in the sense of $D_\infty := D[0, \infty)$, which is the set of functions from
$[0, \infty)$ to $\mathbb{R}$ that are right-continuous and have left-hand limits, with the Skorohod
topology. The definition and the properties of this topology may be found in Chapter 3 of Billingsley
(1999).

We briefly recall the definition of convergence of random paths in the sense of $D_\infty$. Suppose
that we have a sequence of random paths $(X_n)_{n \ge 0}$ such that $X_n : \Omega \to D_\infty$, where
$(\Omega, \mathcal{F}, \mathbb{P})$ is our probability space. We say that $X_n \Rightarrow X_0$ in the
sense of $D_\infty$ if $P_n \Rightarrow P_0$, where $P_n : \mathcal{D}_\infty \to [0, 1]$ are defined as
$P_n[A] = \mathbb{P}[X_n^{-1}(A)]$ for all $n \ge 0$ and $\mathcal{D}_\infty$ are the Borel subsets for
the Skorohod topology.

The following lemma shows that the centralized sum of the output data with $t$ changed by $t/\delta^2$
converges to a Brownian motion in the sense of $D_\infty$.

Lemma 1. Let $x \in \{1, \ldots, k\}$, then

$C_x(\delta, \cdot) \Rightarrow W_x(\cdot)$

as $\delta \to 0$ in the sense of $D[0, \infty)$, where $W_x$ is a standard Brownian motion.

Proof. By Theorem 19.1 of Billingsley (1999),

[display lost in extraction] $\Rightarrow W_x$

in the sense of $D[0, \infty)$.

Fix $\omega \in \Omega$. Observe that

[display lost in extraction; it compares terms with floor and ceil applied to the time index] $\to 0$

uniformly in $[0, s]$ for all $s > 0$, and then by Theorem A.2

[display lost in extraction] $\Rightarrow W_x(\cdot)$

in the sense of $D[0, \infty)$.

Since

[display lost in extraction] $\to 0$

uniformly on $[0, s]$ for every $s > 0$, then by Theorem A.2

[display lost in extraction] $\Rightarrow W_x(\cdot)$.

Finally, observe that for fixed $\omega \in \Omega$,

[display lost in extraction]

uniformly in $[0, t]$ for all $t > 0$, and so by Theorem A.2 the result follows.

□
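
The flavor of Lemma 1 can be checked numerically at a single time point. The Python sketch below is a generic illustration of the central limit scaling with non-normal samples, not the exact process $C_x(\delta, t)$ (whose BIZ-specific time change is not reproduced here): it sums roughly $t/\delta^2$ Exponential(1) samples, centers the sum, and multiplies by $\delta$, so the result should have mean near 0 and variance near $t$. The function name is hypothetical.

```python
import random
import statistics


def scaled_centered_sum(delta, t, rng):
    """delta * (sum of n samples - n * mean) / sd, with n = round(t / delta^2).

    Samples are Exponential(1), so mean = sd = 1 and they are far from normal.
    """
    n = round(t / delta ** 2)
    total = sum(rng.expovariate(1.0) for _ in range(n))
    return delta * (total - n * 1.0) / 1.0


rng = random.Random(0)
values = [scaled_centered_sum(delta=0.1, t=1.0, rng=rng) for _ in range(2000)]
sample_mean = statistics.fmean(values)      # should be close to 0
sample_var = statistics.variance(values)    # should be close to t = 1
```

Shrinking `delta` increases the number of samples per path and brings the distribution of the scaled sum closer to the normal limit.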


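Now, we use the product topology in $D_\infty^k := D[0, \infty)^k$ for $k \in \mathbb{N}$. This
topology may be described as the one under which $(Z_n^1, \ldots, Z_n^k) \to (Z_0^1, \ldots, Z_0^k)$ if
and only if $Z_n^i \to Z_0^i$ for all $i \in \{1, \ldots, k\}$. See the Miscellany of Billingsley
(1968). The following corollary follows from the previous result and independence.

Corollary 1. We have that

$C(\delta, \cdot) := (C_x(\delta, \cdot))_{x \in \{1, \ldots, k\}} \Rightarrow W(\cdot) := (W_x(\cdot))_{x \in \{1, \ldots, k\}}$

as $\delta \to 0$ in the sense of $D_\infty^k$.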

Now that we have obtained this functional central limit theorem for $C(\delta, \cdot)$, we continue
along the proof outline and define the function $f(\cdot, \delta)$ that was sketched there. This
function has three parts: first, computing a “non-centralized” path from an arbitrary input
“centralized” path in $D[0, \infty)^k$; second, applying the BIZ selection map $h$ to this
non-centralized path; and third, reporting whether selection was correct or not.

To accomplish the first part, for each $F \in D[0, \infty)^k$, we define a process indexed by
$t \in \mathbb{R}$, $A \subset \{1, \ldots, k\}$ and $x \in A$ by the right-hand side of (2) with
$C_x(\delta, t)$ replaced by $F_x(t)$. Note that if we replace $F$ by $C(\delta, \cdot)$, we get
$q_{tx}(A)$ in (2).

To accomplish the second and third parts, we define $f(F, \delta)$ to be obtained by applying the BIZ
selection map $h$ to this process, and then reporting whether the selection was correct. More
precisely, $f(F, \delta)$ is defined to be 1 if $h$ applied to this process returns $k$, and 0
otherwise.

We now construct a function $g(\cdot)$ that, when applied to the path of a $k$-dimensional standard
Brownian motion, will be equal in distribution to the indicator of the correct selection event from the
continuous-time BIZ procedure from Frazier (2014) applied to a transformed problem that does not depend
on $\delta$.

We construct $g$ analogously to $f(\cdot, \delta)$, but we replace the path used in the construction of
$f(\cdot, \delta)$ by a new path that doesn’t depend on $\delta$, and is obtained by taking the limit
as $\delta \to 0$. This path is

$q^g_{tx}(A) := \exp\left(\frac{F_x(t)}{\lambda_z} + \frac{1}{\lambda_z^2} t a_x\right) \Big/ \sum_{x' \in A} \exp\left(\frac{F_{x'}(t)}{\lambda_z} + \frac{1}{\lambda_z^2} t a_{x'}\right).$

Then, $g$ is defined to be

$g(F) = 1$ if $h\left((q^g_{tx}(A) : t \in \mathbb{R}, A \subset \{1, \ldots, k\}, x \in A)\right) = k$, and $g(F) = 0$ otherwise.

In the proof of the main theorem, we will show that

$f(C(\delta, \cdot), \delta) \Rightarrow g(W)$

as $\delta \to 0$ in distribution. We will use the following lemma, which shows a continuity property.
A proof of Lemma 2 may be found in a full version of this paper (Toscano-Palmerin and Frazier 2015),
which will be submitted soon to arXiv.

Lemma 2. Let $\{\delta_n\} \subset (0, \infty)$ be such that $\delta_n \to 0$. If
$D_g = \{Z \in D[0, \infty)^k$ : if $\{Z_n\} \subset D[0, \infty)^k$ and
$\lim_n d_\infty(Z_n, Z) = 0$, then the sequence $\{f(Z_n, \delta_n)\}$ converges to $g(Z)\}$, then
$\mathbb{P}(W \in D_g) = 1$.

The following lemma shows that the difference in the correct selection events obtained from applying
the BIZ selection map $h$ to the discrete-time and continuous-time versions of $q_{tx}(A)$ vanishes as
$\delta$ goes to 0. A proof of Lemma 3 may be found in a full version of this paper (Toscano-Palmerin
and Frazier 2015).

Lemma 3. $\lim_{\delta \to 0} \mathbb{P}\left(h\left((q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)\right) = k\right) = \lim_{\delta \to 0} \mathbb{P}\left(f(C(\delta, \cdot), \delta) = 1\right)$.


4.3 The Main Result 

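Theorem 1. If samples from system $x \in \{1, \ldots, k\}$ are identically distributed and
independent, over time and across alternatives, then $\lim_{\delta \to 0} \mathrm{PCS}(\delta) \ge P^*$
provided $\mu_k = a_k \delta, \mu_{k-1} = a_{k-1} \delta, \ldots, \mu_1 = a_1 \delta$;
$a_k > a_{k-1} \ge \cdots \ge a_1$, $a_k - a_{k-1} \ge 1$, and the variances are finite and do not
depend on $\delta$.

Furthermore,

$\inf_{a \in \mathrm{PZ}(1)} \lim_{\delta \to 0} \mathrm{PCS}(\delta) = P^*$

where $\mathrm{PZ}(1) = \{a \in \mathbb{R}^k : a_k - a_{k-1} \ge 1, a_k > a_{k-1} \ge \cdots \ge a_1\}$.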








Proof. Using the definitions given at the beginning of this section, the selection decision of the
discrete-time BIZ procedure for a particular $\delta > 0$ when
$\mu_k = a_k \delta, \mu_{k-1} = a_{k-1} \delta, \ldots, \mu_1 = a_1 \delta$ is given by

$h\left((q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)\right)$

and the probability of correct selection $\mathrm{PCS}(\delta)$ is

$\mathrm{PCS}(\delta) = \mathbb{P}\left(h\left((q'_{tx}(A) : t \in \mathbb{N}, A \subset \{1, \ldots, k\}, x \in A)\right) = k\right)$.

By Lemma 3, we have that

$\lim_{\delta \to 0} \mathrm{PCS}(\delta) = \lim_{\delta \to 0} \mathbb{P}\left(f(C(\delta, \cdot), \delta) = 1\right)$. (3)

We also have, by Lemma 2 and an extension of the continuous mapping theorem (Theorem 5.5 of
Billingsley (1968)),

$f(C(\delta, \cdot), \delta) \Rightarrow g(W)$

in distribution as $\delta \to 0$. This implies that

$\lim_{\delta \to 0} \mathbb{P}\left(f(C(\delta, \cdot), \delta) = 1\right) = \mathbb{P}\left(g(W) = 1\right)$. (4)

The random variable $g(W)$ is equal in distribution to the indicator of the event of correct selection
that results from applying the continuous-time BIZ procedure from Frazier (2014) in a problem with
indifference-zone parameter equal to 1, where each alternative’s observation process has volatility
$\lambda_z$ and drift $a_x$. This can be seen by noting that the path used to define $g$ above is equal
in distribution to the path $(q_{tx}(A) : t \ge 0)$ defined in equation (2) of Frazier (2014), and that
the selection decision of the continuous-time algorithm in Frazier (2014) is obtained by applying $h$
to this path.

Theorem 1 in Frazier (2014) states that

$\mathbb{P}\left(g(W) = 1\right) \ge P^*$. (5)

Combining (3), (4), and (5), we have

$\lim_{\delta \to 0} \mathrm{PCS}(\delta) \ge P^*$.

Furthermore, Theorem 1 in Frazier (2014) shows that

$\inf_{a \in \mathrm{PZ}(1)} \mathbb{P}\left(g(W) = 1\right) = P^*$, (6)

where $\mathrm{PZ}(1) = \{a \in \mathbb{R}^k : a_k - a_{k-1} \ge 1\}$.
Combining (3), (4), and (6) shows

$\inf_{a \in \mathrm{PZ}(1)} \lim_{\delta \to 0} \mathrm{PCS}(\delta) = P^*$.

□
Figure 1: The PCS of the BIZ procedure versus $\delta$ for three different slippage configurations with
100 alternatives and $P^* = 0.9$. Panels: (a) known heterogeneous variances, $\lambda_k^2 = 0.25$,
$\lambda_1^2 = 1$, $n_0 = 0$; (b) unknown highly heterogeneous variances, $\lambda_k^2 = 100$,
$\lambda_1^2 = 1$, $n_0 = 15$; (c) known highly heterogeneous variances, $\lambda_k^2 = 100$,
$\lambda_1^2 = 1$, $n_0 = 0$. We observe in all three examples that the PCS converges to $P^*$ as
$\delta$ goes to 0. The first example (a) shows typical behavior, where the PCS is above $P^*$ for all
values of $\delta$. The second (b) and third (c) examples are atypical, and were chosen specially to
illustrate that BIZ can underdeliver on PCS in slippage configurations when $n_0$ is small and the
variance of the best alternative is much larger than the variance of the other alternatives.


5 NUMERICAL EXPERIMENTS 


We now use simulation experiments to illustrate and further investigate the phenomenon characterized by
Theorem 1. Using the version of BIZ described in Section 3 with maximum elimination
($c = 1 - (P^*)^{1/(k-1)}$), we estimate and then plot the PCS as a function of $\delta$. In all
examples, $P^* = 0.9$, PCS was estimated using 10,000 independent replications, and confidence
intervals have length at most 0.014.
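
For a rough sense of the reported interval lengths, the following Python sketch computes the length of a normal-approximation confidence interval for an estimated proportion. The 95% level (z = 1.96) is an assumption, since the confidence level is not stated here, and the function name is hypothetical.

```python
import math


def ci_length(p, replications, z=1.96):
    """Length of a normal-approximation confidence interval for a proportion."""
    return 2.0 * z * math.sqrt(p * (1.0 - p) / replications)


length_at_target = ci_length(0.9, 10_000)   # about 0.0118
```

Under this assumption, an estimated PCS near the target of 0.9 with 10,000 replications gives an interval length below the stated bound of 0.014.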

Our first example, illustrated in Figure 1a, is a known variance slippage configuration where the
variance of the best alternative is 1/4 of the variance of the worst alternative. Specifically, we
consider 100 systems with independent normally distributed samples, where $\mu_{100} = \delta$,
$\mu_{99} = 0, \ldots, \mu_1 = 0$, $\delta$ is within the interval [0.1, 10], and
$\lambda_{100} = 1, \ldots, \lambda_1 = 0.5$. Here, $n_0 = 0$. Figure 1a shows that in this example the
IZ guarantee is always satisfied. Moreover, the PCS approaches $P^*$ as $\delta$ goes to zero, as
predicted by Theorem 1. When $\delta$ is big enough, the PCS is almost one because the difference
between the best system and the others is large enough to be easily identifiable by the BIZ procedure.

Our second example, illustrated in Figure 1b, is an unknown variance slippage configuration where
the variance of the best alternative is 100 times larger than the variance of the other alternatives.
Although Theorem 1 applies only to the known-variance version of BIZ, we conjecture that the
unknown-variance version of BIZ should exhibit similar behavior. In this example, we consider 100
systems with independent normally distributed samples, where $\mu_{100} = \delta$,
$\mu_{99} = 0, \ldots, \mu_1 = 0$, $\delta$ is within the interval [0.1, 10], and $\lambda_{100} = 10$,
$\lambda_{99} = \cdots = \lambda_1 = 1$. We set $n_0 = 15$. As $\delta$ goes to 0, we observe that the
PCS converges to $P^*$, as it did in the known-variance slippage configuration example. In this
example, we have intentionally chosen $n_0$ to be smaller than the recommended value of 100, and have
chosen a large variance for the best system, to cause BIZ to fail to meet the IZ guarantee for
$\delta > 0$. Increasing the parameter $n_0$ typically causes BIZ to meet the IZ guarantee for all
$\delta$, and we recommend a larger value of $n_0$ in practice. The choice of $n_0$, and its impact on
PCS, merits further study.

Our third example, illustrated in Figure 1c, uses the same sampling distributions as the second
example, but assumes the variances are known, and sets $n_0 = 0$. The effect of this change, and
especially of setting $n_0$ to 0, is to cause significant underdelivery on PCS for large values of
$\delta$. As remarked above, this example was chosen specially to illustrate that BIZ can underdeliver
on PCS in slippage configurations when $n_0$ is small, and the variance of the best alternative is much
larger than the variance of the worst alternative. However, as predicted by Theorem 1, the PCS
converges to $P^*$ as $\delta$ grows small, even in this pathological case.


6 CONCLUSION 


We have proved the asymptotic validity of the Bayes-inspired Indifference Zone procedure (Frazier 2014)
when the variances are known. This algorithm has been observed empirically to take fewer samples than
other IZ procedures, especially for problems with large numbers of alternatives, and so characterizing
when it satisfies the indifference-zone guarantee is important for understanding when it should be used
in practice.